Scroll down to see the explanation.
makeNseFunction1 <- function(fun) {
  function(data, elementOrFormula, ...) {
    functionEnvironment = environment()
    # Bare element name: fetch the element by its deparsed name.
    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      argument = data[[deparse(substitute(elementOrFormula))]]
    } else {
      # Formula: copy every variable it mentions from data into this environment.
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data), envir = functionEnvironment)
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
}
makeNseFunction2 <- function(fun) {
  function(data, elementOrFormula, ...) {
    library(rlang)  # provides enexpr()
    functionEnvironment = environment()
    # Bare element name: capture it and evaluate it inside data.
    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      elementOrFormula = substitute(elementOrFormula)
      argument = eval(enexpr(elementOrFormula), data)
    } else {
      # Formula: copy every variable it mentions from data into this environment.
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data))
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
}
makeNseFunction3 <- function(fun) {
  function(data, elementOrFormula, ...) {
    functionEnvironment = environment()
    # Bare element name: evaluate the captured name inside data (base R only).
    if (as.character(substitute(elementOrFormula)) %in% names(data)) {
      argument = eval(substitute(elementOrFormula), data)
    } else {
      # Formula: copy every variable it mentions from data into this environment.
      allVariables = all.vars(elementOrFormula)
      for (variable in allVariables) {
        assign(variable, eval(as.name(variable), data))
      }
      argument = elementOrFormula
      environment(argument) = functionEnvironment
    }
    fun(argument, ...)
  }
}
Each of the functions implements a different way to do a simple thing.
CREATING AN NSE VERSION OF A GIVEN FUNCTION WHICH RETRIEVES A FIELD USING NON-STANDARD EVALUATION WITH BUILT-IN R MECHANISMS OR EXTERNAL METHODS
Sounds evil?
Well, it is. But it is quite easy to understand.
The thing is that in R you can get an element from, e.g., a list like so:
myList = list(a = 1, b = 2, c = 3)
myList$a
## [1] 1
BUT. Have you ever wondered what “a” actually is here? The one in the second line.
Let’s check it:
a
… Actually I cannot check it, because my markdown wouldn’t compile.
I would get an error saying that “a” is not found. Well, it was not defined, so it should NOT be found.
How does it work in the first example then? Let’s call it R magic for now.
Another question:
What if I wanted to pass “a” as a parameter to a function and use the $ operator inside my function like so:
myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  myList$a
}
myFunction(a, myList)
## [1] 1
WOW, it works with no problems.
It’s R - it only looks like it works.
Let’s see the example where I want to get “b”:
myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  myList$a
}
myFunction(b, myList)
## [1] 1
Oops.
I am not going to go into detail, but the problem is that the $ operator never evaluates the name on its right-hand side. Inside the function, myList$a always means “the element literally named a”, no matter what was passed as the argument a.
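To make this concrete, here is a minimal sketch (the variable fieldName is made up for illustration) showing that $ takes the name literally and ignores the value of a variable, while [[ evaluates its index first:

```r
myList <- list(a = 1, b = 2, c = 3)
fieldName <- "b"   # hypothetical variable, not from the original post

# `$` looks for an element literally called "fieldName"; there is none.
viaDollar <- myList$fieldName
# `[[` evaluates fieldName first and then looks up "b".
viaBrackets <- myList[[fieldName]]

print(is.null(viaDollar))   # TRUE
print(viaBrackets)          # 2
```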
Long story short, to fix this error we have to do something like this:
myList = list(a = 1, b = 2, c = 3)
myFunction = function(a, myList){
  eval(substitute(a), myList)
}
myFunction(b, myList)
## [1] 2
What what what. What happened here?
In very, very basic words we can say that:
“The function eval evaluates an expression, looking names up in the given object (here myList)”
What about “substitute”?
It simply “retrieves” the expression that was originally passed as the argument (here the name b) without evaluating it, so eval can look it up in the same way as “a” is looked up in:
myList$a
## [1] 1
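A small sketch of the two pieces in isolation may help. The helper captureArgument is made up for this illustration and is not part of the code in this post. It shows that substitute returns the unevaluated name and that eval then resolves it inside the list:

```r
myList <- list(a = 1, b = 2, c = 3)

# Hypothetical helper: returns the argument exactly as the caller typed it.
captureArgument <- function(x) substitute(x)

captured <- captureArgument(b)   # b is never evaluated, so no "not found" error
print(class(captured))           # "name"

# eval looks the name up among the elements of myList.
print(eval(captured, myList))    # 2

# Whole expressions work the same way.
print(eval(quote(a + b), myList))   # 3
```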
If you want a proper explanation, I recommend checking these examples: Examples
Or these explanations:
I also recommend listening to Hadley Wickham’s 5 (actually 6) minute talk about “Tidy evaluation”. It really helped me NOT to kill myself in this process. Hope it helps you too.
Getting back to our NSEketeers.
What you saw at the beginning are a few implementations, each of which should create an NSE (non-standard evaluation) function from a non-NSE function.
We want this call:
min(myList$a)
to be equivalent to this call:
min_NSE(myList, a)
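Stripped of the formula handling, the core idea could look roughly like this. The name makeNseSketch is an assumption for illustration; it is a simplified stand-in, not one of the three implementations at the top of the post:

```r
# Hypothetical, simplified wrapper factory: element names only, no formulas.
makeNseSketch <- function(fun) {
  function(data, element, ...) {
    # Capture the name as typed, look it up in data, then call the wrapped function.
    fun(eval(substitute(element), data), ...)
  }
}

min_NSE <- makeNseSketch(min)
myList <- list(a = c(3, 1, 2))

print(identical(min_NSE(myList, a), min(myList$a)))   # TRUE
```

Extra arguments are forwarded through ..., so a call such as min_NSE(myList, a, na.rm = TRUE) behaves like min(myList$a, na.rm = TRUE).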
They also take into account formulas but that would be too much to explain at once, so we’ll skip it.
We are here to check their time efficiency.
I also wanted to check their memory usage but… let’s see this Stack Overflow answer.
So onto the testing we go!
testedFunction = min
datasetSize = 100
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = min
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = max
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = mean
datasetSize = 10000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = mean
datasetSize = 1000000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = mean
datasetSize = 1000000
dataset = list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction = lm
datasetSize = 100
dataset = list(x = sample(x = 1:100, size = datasetSize, replace = TRUE), y = sample(x = 1:100, size = datasetSize, replace = TRUE))
formula = x ~ y
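The timing code itself is not shown in this section. As a rough sketch of how one of the configurations above could be timed, assuming base R's system.time and a simplified stand-in wrapper (makeNseSketch and timeCalls are made-up names, not the code that was actually measured), something like this would do:

```r
set.seed(1)  # only to make the sampled data reproducible

# Hypothetical, simplified wrapper factory: element names only, no formulas.
makeNseSketch <- function(fun) {
  function(data, element, ...) fun(eval(substitute(element), data), ...)
}

testedFunction <- mean
datasetSize <- 10000
dataset <- list(a = sample(x = 1:100, size = datasetSize, replace = TRUE))
testedFunction_NSE <- makeNseSketch(testedFunction)

# Hypothetical helper: total elapsed seconds for `times` repeated calls.
timeCalls <- function(thunk, times = 1000) {
  system.time(for (i in seq_len(times)) thunk())[["elapsed"]]
}

plainTime <- timeCalls(function() testedFunction(dataset$a))
nseTime <- timeCalls(function() testedFunction_NSE(dataset, a))

# Both variants must agree before their timings are worth comparing.
stopifnot(identical(testedFunction(dataset$a), testedFunction_NSE(dataset, a)))
print(c(plain = plainTime, nse = nseTime))
```

Elapsed times from system.time are coarse and vary between runs, so a dedicated benchmarking package such as microbenchmark is likely a better fit for small differences like these.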